Recurrent Neural Network Language Model Adaptation Derived Document Vector

Authors

  • Wei Li
  • Brian Kan-Wing Mak
Abstract

In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector. One major shortcoming of the frequency-based TF-IDF feature vector is that it ignores word order, which carries syntactic and semantic relationships among the words in a document and can be important in NLP tasks such as genre classification. This paper proposes a novel distributed vector representation of a document: a simple recurrent neural network language model (RNN-LM) or a long short-term memory RNN language model (LSTM-LM) is first trained on all documents in a task; some of the LM parameters are then adapted to each document, and the adapted parameters are vectorized to represent the document. The new document vectors are labeled DV-RNN and DV-LSTM, respectively. We believe that these document vectors can capture high-level sequential information in the documents that other current document representations fail to capture. The new document vectors were evaluated on genre classification of documents in three corpora: the Brown Corpus, the BNC Baby Corpus and an artificially created Penn Treebank dataset. Their classification performance is compared with that of the TF-IDF vector and the state-of-the-art distributed memory model of paragraph vectors (PV-DM). The results show that DV-LSTM significantly outperforms TF-IDF and PV-DM in most cases, and that combining the proposed document vectors with TF-IDF or PV-DM may further improve performance.
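As a rough illustration of the method described in the abstract, the sketch below first trains one background LSTM language model on all documents, then fine-tunes a copy of it on each document and flattens the adapted parameters into that document's vector. This is a minimal PyTorch sketch under assumed choices (toy tokenization, adapting only the output layer, arbitrary hyper-parameters); the abstract does not give the exact adaptation recipe, which may differ.

    # Hedged sketch of DV-LSTM: background LSTM LM over all documents, then
    # per-document adaptation; the adapted parameters become the document vector.
    import copy
    import torch
    import torch.nn as nn

    class LSTMLM(nn.Module):
        def __init__(self, vocab_size, emb_dim=32, hid_dim=32):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
            self.out = nn.Linear(hid_dim, vocab_size)

        def forward(self, ids):
            h, _ = self.lstm(self.emb(ids))
            return self.out(h)

    def lm_loss(model, ids):
        # next-word prediction loss over one (1, T) sequence of token ids
        logits = model(ids[:, :-1])
        return nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), ids[:, 1:].reshape(-1))

    def dv_lstm(base, ids, steps=5, lr=0.1):
        # Adapt a copy of the background LM on a single document and vectorize
        # the adapted parameters. Adapting only the output layer is an
        # assumption; the paper selects which LM parameters to adapt.
        model = copy.deepcopy(base)
        opt = torch.optim.SGD(model.out.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            lm_loss(model, ids).backward()
            opt.step()
        return model.out.weight.detach().flatten()

    # Toy usage: two tiny "documents" mapped to integer token ids.
    docs = [["the", "cat", "sat", "down"], ["stocks", "fell", "again", "today"]]
    vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}
    enc = lambda d: torch.tensor([[vocab[w] for w in d]])

    base = LSTMLM(len(vocab))
    opt = torch.optim.Adam(base.parameters(), lr=1e-2)
    for _ in range(100):  # train the background LM on all documents
        for d in docs:
            opt.zero_grad()
            lm_loss(base, enc(d)).backward()
            opt.step()

    vectors = [dv_lstm(base, enc(d)) for d in docs]  # one vector per document

In the actual task, these vectors would then feed a genre classifier, and, as the abstract notes, they can also be combined with TF-IDF or PV-DM features.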

Related articles

Derivation of Document Vectors from Adaptation of LSTM Language Model

In many natural language processing tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector. One major shortcoming of the TF-IDF feature vector is that it ignores word order, which carries syntactic and semantic relationships among the words in a document. This paper proposes a novel distributed vector representation of a document...

A Recurrent Neural Network Model for solving CCR Model in Data Envelopment Analysis

In this paper, we present a recurrent neural network model for solving the CCR model in Data Envelopment Analysis (DEA). The proposed neural network model is derived from an unconstrained minimization problem. On the theoretical side, it is shown that the proposed neural network is stable in the sense of Lyapunov and globally convergent to the optimal solution of the CCR model. The proposed model has...

Document Embeddings via Recurrent Language Models

Document embeddings serve to supply richer semantic content for downstream tasks that require fixed-length inputs. We propose a novel unsupervised framework for training document vectors using a modified Recurrent Neural Network Language Model, which we call DRNNLM, that incorporates a document vector into the calculation of the hidden state and prediction at each time step. Our goal is to...
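Although the snippet above is truncated, the stated mechanism, feeding a document vector into both the hidden-state update and the prediction at each time step, can be sketched as below. Combining the document vector by concatenation is an assumption here; the snippet does not give DRNNLM's exact formulation.

    # Hedged sketch of the DRNNLM idea: a (trainable) document vector d enters
    # both the hidden-state update and the word prediction at every time step.
    # Combining d by concatenation is assumed, not taken from the paper.
    import torch
    import torch.nn as nn

    class DRNNLM(nn.Module):
        def __init__(self, vocab_size, emb_dim=32, hid_dim=32, doc_dim=16):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.cell = nn.RNNCell(emb_dim + doc_dim, hid_dim)   # d joins the input
            self.out = nn.Linear(hid_dim + doc_dim, vocab_size)  # d joins the prediction

        def forward(self, ids, d):
            # ids: (batch, T) token ids; d: (batch, doc_dim) document vector
            h = d.new_zeros(ids.size(0), self.cell.hidden_size)
            logits = []
            for t in range(ids.size(1)):
                x = torch.cat([self.emb(ids[:, t]), d], dim=-1)
                h = self.cell(x, h)  # hidden state depends on d
                logits.append(self.out(torch.cat([h, d], dim=-1)))
            return torch.stack(logits, dim=1)  # (batch, T, vocab_size)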

A Recurrent Neural Network to Identify Efficient Decision Making Units in Data Envelopment Analysis

In this paper we present a recurrent neural network model to recognize efficient Decision Making Units (DMUs) in Data Envelopment Analysis (DEA). The proposed neural network model is derived from an unconstrained minimization problem. On the theoretical side, it is shown that the proposed neural network is stable in the sense of Lyapunov and globally convergent. The proposed model has a single-laye...

Fast Gated Neural Domain Adaptation: Language Model as a Case Study

Neural network training has been shown to be advantageous in many natural language processing applications, such as language modelling or machine translation. In this paper, we describe in detail a novel domain adaptation mechanism in neural network training. Instead of learning and adapting the neural network on millions of training sentences, which can be very time-consuming or even infeasibl...

Journal:
  • CoRR

Volume: abs/1611.00196

Publication date: 2016